Abstract

We study classical simulation of quantum computation, taking the Gottesman-Knill theo- 
rem as a starting point. We show how each Clifford circuit can be reduced to an equivalent, 
manifestly simulatable circuit (normal form). This provides a simple proof of the Gottesman- 
Knill theorem without resorting to stabilizer techniques. The normal form highlights why 
Clifford circuits have such limited computational power in spite of their high entangling
power. At the same time, the normal form shows how the classical simulation of Clifford circuits
fits into the standard way of embedding classical computation into the quantum circuit
model. This leads to simple extensions of Clifford circuits which are classically simulatable. 
These circuits can be efficiently simulated by classical sampling ("weak simulation") even 
though the problem of exactly computing the outcomes of measurements for these circuits 
("strong simulation") is proved to be #P-complete — thus showing that there is a separation 
between weak and strong classical simulation of quantum computation. 

1 Introduction 

The Gottesman-Knill theorem [1] is a well-known result in quantum information theory which
states that a certain class of non-trivial quantum circuits, called "Clifford circuits", can be
simulated efficiently on a classical computer, and hence cannot provide any speed-up with respect to
classical computation. A Clifford circuit is any quantum circuit which is composed of Hadamard, PHASE
and CNOT gates. The Gottesman-Knill theorem asserts that each (uniform family of) Clifford
circuit(s), when acting on the computational basis state $|0\rangle = |0\rangle^{\otimes N}$, and when followed by a
computational basis measurement, can be simulated efficiently on a classical computer.

While in fact not so hard to prove, this result exhibits some rather remarkable and sometimes 
puzzling features, not all of which are fully understood. For example, even though they are
efficiently classically simulatable, Clifford circuits can generate a high degree of entanglement [2];
the highly entangled cluster states [3] can e.g. be generated by Clifford circuits. This very feature
raises doubts about the often-recited mantra that "entanglement is responsible for the quantum
computational speed-up". In particular, it highlights that, while the presence of (certain types of)
entanglement in a quantum computation (QC) is provably necessary to disallow efficient classical
simulation (see e.g. [4, 5, 6, 7, 8, 9, 10]), it is certainly not sufficient.

Further, it is known that Clifford circuits can efficiently be simulated classically by a rather
restricted classical computer, namely a circuit model computer which only uses NOT and CNOT
gates [11]. In other words, it is not necessary to invoke the full power of classical computers to
efficiently simulate arbitrary Clifford circuits. Hence, Clifford circuits are most likely not even
universal for classical computation. In complexity theoretic terms, the power of classical
computation with NOT and CNOT gates (and hence of Clifford circuits) is captured by the complexity
class ⊕L ("parity-L") [12, 13]. This class is known to be contained in P but is not expected to be
equal to it (although this is an unproven conjecture). Nevertheless, supplementing Clifford
operations with essentially any non-Clifford gate immediately yields the full QC model [14, 15]. This
yields an immediate "jump" in computational power from ⊕L to BQP rather than a "smooth"
transition ⊕L → BPP → BQP. In particular, this property makes it hard to extend Clifford
circuits to a class of efficiently simulatable quantum circuits which has the same computational
power as full classical computation.

It is the aim of this note to obtain some insight in the above list of features. First we will 
show that each Clifford circuit C can be (efficiently) reduced to an equivalent circuit C' which
yields the same output. The "normal form" C' is also a Clifford circuit but has a very simple
structure, as displayed in Fig. 2; this normal form is based on an earlier result [16], for which we
also provide a simple proof. We will see that C' is manifestly efficiently classically simulatable.
We argue that the normal form also sheds some light on why the high degree of entanglement 
and interference generated by Clifford operations does not result in any (exponential) quantum 
computational speed-up. Finally, the normal form shows how the simulation of Clifford operations 
fits within the standard embedding of probabilistic classical computation into the quantum circuit 
model. The latter is related to the notion of HT circuits, as discussed in section 4.

Along the way, we will make some general remarks regarding classical simulation of QC. One 
of them regards the different possible definitions of the notion of "classical simulation". When 
a QC is to be simulated classically, the aim may be to either (i) compute the probabilities of 
the output measurement efficiently classically with high accuracy ("strong simulation") or (ii) 
sample from this distribution efficiently using a classical computer ("weak simulation"). Both 
variants constitute valid classical simulation techniques; however, a priori it is not clear whether 
there is a clear separation between these two notions. The vast majority of all works (see e.g. 
[1, 6, 7, 9, 10, 18, 19, 20, 21, 22, 23, 24, 25]) regarding classical simulation of QC considers
quantum circuits which can be simulated efficiently classically in the strong sense, whereas not 
much is known about weak simulation. 

We will find that the simulatable quantum circuits considered in the present paper are not 
amenable to strong simulation — in fact, it is easy to show that strong simulation of these circuits 
constitutes a #P-complete problem. Nevertheless, weak classical simulation of the same circuits
is efficiently possible by means of a very simple sampling technique. This shows that strong and 
weak simulation of QC are fundamentally different notions. Moreover, these examples highlight 
that any serious attempt to understand the relationship between classical and quantum computation 
should not rely on the notion of strong simulation. 

2 Classical simulation of quantum computation



Here we state more precisely what we mean by "efficient classical simulation of quantum
computation". We essentially follow the definitions from [4, 21].

Consider a uniform family of quantum circuits^1 $\mathcal{U} \equiv \mathcal{U}_N$ acting on the N-qubit input state
$|0\rangle = |0\rangle^{\otimes N}$, and followed by a measurement of, say, the first qubit in the computational basis.
Then this quantum computation yields as an outcome a bit $\alpha \in \{0, 1\}$. The probability that the
outcome $\alpha$ occurs is given by $\pi(\alpha) = \langle 0|\mathcal{U}^\dagger [|\alpha\rangle\langle\alpha| \otimes I] \mathcal{U}|0\rangle$. We say that the above quantum
computation can be efficiently simulated classically in the strong sense if it is possible to evaluate
$\pi(0)$ up to M digits in poly(N, M) time on a classical computer. Furthermore, we say that the
quantum computation can be efficiently simulated classically in the weak sense if it is possible to
sample once from the probability distribution $\{\pi(\alpha)\}$ in poly(N) time on a classical computer^2.

We further point out that to date the vast majority of results on simulation of quantum
computation regard strong classical simulation. This holds e.g. for the Gottesman-Knill theorem, the
simulation of "matchgates" [18, 19, 20, 21], the simulation of the quantum Fourier transform
[22, 23, 24], simulation results involving tensor contraction techniques [6, 7, 9, 10], etc. However,
in order to study the difference between classical and quantum computation, the notion of weak
classical simulation is much more appropriate. In fact, the strong simulation of general quantum
circuits can easily be shown to be a #P-hard problem (this can e.g. be shown by considering
the family of circuits $\mathcal{U}_f$ as in section 4). As BQP is believed to be much smaller than #P,
this indicates that strong classical simulation is too strong a requirement for studying
the relation between classical and quantum computation. However, not much is known about the
difference between strong and weak simulation of quantum computation. In section 4 we will give
an example of a class of quantum circuits which is efficiently simulatable in the weak sense but
which is intractable in the strong sense (unless P is equal to #P).

3 The Gottesman-Knill theorem 

In this section we recall the Gottesman-Knill theorem and the relation between Clifford operations
and the complexity class ⊕L. (A variant of) the Gottesman-Knill theorem can be formulated as
follows [1]:

Theorem 1 (Gottesman-Knill) Every (uniform family of) Clifford circuit(s), when applied to the 
input state $|0\rangle = |0\rangle^{\otimes N}$ and when followed by a Z measurement of the first qubit, can be efficiently
simulated classically in the strong sense. 

The standard technique to prove this result involves a connection between Clifford operations and 
groups of commuting Pauli operations called "stabilizer groups". We omit this proof here and 
refer to e.g. [1, 26]. Below we will provide an alternative proof of the Gottesman-Knill theorem
which does not use stabilizer techniques. 

^1 In the following, when we refer to a "quantum circuit" we will always mean a uniform family of quantum circuits.

^2 More precisely, one should require to sample from a probability distribution which is not necessarily exactly equal
to $\{\pi(\alpha)\}$, but sufficiently close to it (see e.g. [4]). In the examples in this paper, however, it will always be possible to
perform an exact sampling, such that we omit such accuracy issues for simplicity.

It turns out that a rather restricted classical computer suffices to efficiently simulate arbitrary
Clifford circuits. Indeed, every Clifford circuit can be efficiently simulated by a classical circuit
model computer which uses NOT and CNOT gates only, applied to the input string x = (0, ..., 0)
[11]. More precisely, in [11] the problem Gottesman-Knill was defined as follows: the input
is an n-qubit Clifford circuit C; the problem is to decide whether the first qubit will be in the
state $|1\rangle$ with certainty after C has been applied to the input $|0\rangle^{\otimes n}$. It was shown in [11] that
this problem can be mapped, under a logarithmic-space reduction, to a problem of simulating a
classical poly-size CNOT-NOT circuit acting on the all-zeroes input state. The complexity class
of problems that are log-space reducible to the simulation of CNOT-NOT circuits is called ⊕L
("Parity-L")^3 [12, 13]. Hence, it was shown in [11] that Gottesman-Knill is in ⊕L. As
CNOT and NOT gates are Clifford operations, it is clear that every ⊕L problem can be reduced to
Gottesman-Knill. In conclusion, the problem Gottesman-Knill is ⊕L-complete [11].
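
As a minimal illustration of the restricted classical model referred to above, the following Python sketch runs a circuit of NOT and CNOT gates on the all-zeroes input string. The gate encoding and the function name `run_cnot_not_circuit` are our own illustrative choices and are not taken from [11]; the sketch also makes no attempt to respect the logarithmic-space bound that underlies the definition of ⊕L.

```python
from typing import List, Tuple

# Hypothetical gate encoding: ("NOT", (target,)) or ("CNOT", (control, target)).
Gate = Tuple[str, Tuple[int, ...]]


def run_cnot_not_circuit(n_bits: int, gates: List[Gate]) -> List[int]:
    """Apply NOT and CNOT gates, in order, to the all-zeroes bit string."""
    bits = [0] * n_bits
    for name, wires in gates:
        if name == "NOT":
            (target,) = wires
            bits[target] ^= 1  # flip the target bit
        elif name == "CNOT":
            control, target = wires
            bits[target] ^= bits[control]  # add the control bit modulo two
        else:
            raise ValueError(f"unsupported gate: {name}")
    return bits


# Example: NOT on bit 0, then copy it along a chain of CNOTs.
example = [("NOT", (0,)), ("CNOT", (0, 1)), ("CNOT", (1, 2))]
final_bits = run_cnot_not_circuit(3, example)
```

Every gate acts as an affine map over $\mathbb{Z}_2$ on the bit string, which is one way to see why such circuits are tied to linear algebra modulo two.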

The problems in ⊕L are centered around linear algebra over the finite field $\mathbb{Z}_2$. It
has been shown that the problems of solving linear equations over $\mathbb{Z}_2$, finding the inverse of a
nonsingular $\mathbb{Z}_2$-matrix, multiplying matrices over $\mathbb{Z}_2$ etc., are ⊕L problems [13].

4 Embedding classical in quantum computation 

Here we briefly review how classical (probabilistic) computation can be regarded as being a "part" 
of quantum computation. All material in this section is standard (see e.g. [26]), except for an 
observation about the difference between strong and weak simulation of quantum computation, 
which will be made at the end of this section. 

Probabilistic classical computation is classical computation (e.g. considered in the circuit
model) supplemented with the possibility of deciding, in each step of the computation, which gate
to apply based on the random outcome of a coin toss. It is well understood how probabilistic
classical computation can be embedded in the full pure-state circuit model of quantum computation.
The subclass of quantum circuits that corresponds to probabilistic classical computation is the
following: (ROUND 1) apply Hadamard gates to an arbitrary subset of qubits; (ROUND 2) apply a
(uniform family of) circuit(s) consisting of classical^4 gates only (e.g. NOT, CNOT, Toffoli gates).
A quantum circuit with the above structure will be called an HT circuit (short for
Hadamard-Toffoli); see also Fig. 1. It can easily be shown that every probabilistic classical computation can
be accounted for by a suitable HT circuit acting on the all-zeroes input state and followed by a
single-qubit Z measurement (say, on the first qubit). Conversely, it is also straightforward to see
that each HT circuit (with input and measurement as above) can be simulated efficiently, in the
weak sense, by a probabilistic classical computer: if the quantum register has N qubits, if m
denotes the number of Hadamard gates applied in ROUND 1, and if the classical gates in ROUND
2 compute an efficiently computable (invertible) function $f : \{0,1\}^N \to \{0,1\}^N$, then the state
of the system after the first two rounds has, up to normalization, the form
$\sum_{x \in \{0,1\}^m} |f(x, 0, \ldots, 0)\rangle$, where there are N - m zeroes in the argument of f. We will use
the abbreviation $F(x) \equiv f(x, 0, \ldots, 0)$. Performing a $\{|0\rangle, |1\rangle\}$ measurement on the first qubit
yields as an outcome a bit $\alpha \in \{0, 1\}$. The probability that $\alpha$ occurs is equal to

$$\pi(\alpha) = \frac{|\{x \in \{0,1\}^m : F(x)_1 = \alpha\}|}{2^m}, \qquad (1)$$

where $F(x)_1$ denotes the first bit of F(x). Sampling from this probability distribution can easily
be done classically: simply generate an m-bit string x uniformly at random, then compute F(x)
and finally set $\alpha := F(x)_1$; then $\alpha$ is generated with probability $\pi(\alpha)$ as desired. This shows that
weak classical simulation of HT quantum circuits is efficiently possible. We can therefore
conclude that the computational power of HT quantum circuits is equivalent to probabilistic classical
computation.

^3 Obviously, ⊕L ⊆ P; the inclusion is believed to be strict, but a proof of this conjecture has not yet been found.
^4 We call any unitary gate U a "classical gate" if it maps computational basis states to computational basis states.

Figure 1: Quantum circuits with the above structure are here called HT circuits (first
a round of Hadamard gates is applied to a subset of the qubits, followed by a round of "classical
gates", i.e. TOFFOLI, NOT, etc.). HT circuits followed by a computational basis measurement have
the same computational power as probabilistic classical computation: each probabilistic classical
computation can be simulated efficiently by an HT circuit, and each HT circuit can be weakly
simulated efficiently on a classical computer. However, strong classical simulation of HT circuits
is a #P-complete problem.
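
The sampling procedure for HT circuits described in this section can be sketched in a few lines of Python. The sketch assumes, purely for illustration, that the Hadamard gates of ROUND 1 act on the first m qubits and that ROUND 2 is supplied as a Python function mapping a list of N bits to a list of N bits; the names `sample_ht_outcome` and `and_into_first_qubit` are hypothetical.

```python
import random
from typing import Callable, List, Optional

Bits = List[int]


def sample_ht_outcome(m: int, n_qubits: int, f: Callable[[Bits], Bits],
                      rng: Optional[random.Random] = None) -> int:
    """Draw one outcome of a first-qubit measurement after an HT circuit.

    Assumption: the m Hadamard gates act on qubits 0..m-1, and f is the
    classical reversible map implemented by ROUND 2.
    """
    if rng is None:
        rng = random.Random()
    x = [rng.randrange(2) for _ in range(m)]   # uniformly random m-bit string
    y = f(x + [0] * (n_qubits - m))            # F(x) = f(x, 0, ..., 0)
    return y[0]                                # first bit of F(x)


def and_into_first_qubit(bits: Bits) -> Bits:
    """Hypothetical ROUND 2 on 3 bits: Toffoli (controls 0, 1; target 2), then swap bits 0 and 2."""
    out = list(bits)
    out[2] ^= out[0] & out[1]
    out[0], out[2] = out[2], out[0]
    return out
```

With m = 2 and N = 3, the first output bit of this example equals the AND of two uniformly random bits, so repeated calls return 1 with relative frequency close to 1/4, in agreement with Eq. (1).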

We remark that strong classical simulation of HT circuits is much harder; in fact, we point out
that such strong simulation is a #P-complete problem. The complexity class #P is concerned
with counting problems: given an efficiently computable Boolean function f ∈ P, the problem
of determining the integer $\#f := |\{x : f(x) = 0\}|$ defines the complexity class #P. It is
easy to see that the problem of computing the probabilities (1) is at least as hard as computing
#f for arbitrary functions f ∈ P. To show this, consider an arbitrary Boolean function
$f : \{0,1\}^m \to \{0,1\}$ which is in P. Then there exists a poly-sized HT quantum circuit $\mathcal{U}_f$ such
that $\mathcal{U}_f|0\rangle \propto \sum_x |x\rangle|f(x)\rangle$. Performing a measurement in the computational basis on the last
qubit yields a single bit with probability distribution $\pi(0) = \#f/2^m = 1 - \pi(1)$. Computing
these probabilities with perfect accuracy is thus equivalent to computing #f. This shows that
strong simulation of HT circuits is at least as hard as any problem in #P and is thus #P-hard.
Moreover, it can easily be shown that, for arbitrary HT circuits and for arbitrary computational
basis measurements, the problem of computing the probabilities (1) is always in #P. This shows
that the strong simulation of HT circuits is a #P-complete problem. In conclusion, efficient strong
classical simulation of HT quantum circuits is intractable (unless P is equal to #P), while weak
simulation of these circuits is efficiently possible (even trivially so).
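
To make the contrast with weak simulation concrete, the sketch below evaluates $\pi(0) = \#f/2^m$ exactly by enumerating all $2^m$ inputs. This brute-force enumeration is our own illustration and not a method proposed in the text; its running time grows exponentially with m, whereas drawing a single sample only requires one evaluation of f. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product
from typing import Callable, Tuple


def count_zeros(f: Callable[[Tuple[int, ...]], int], m: int) -> int:
    """Return #f = |{x in {0,1}^m : f(x) = 0}| by exhaustive enumeration (2^m evaluations)."""
    return sum(1 for x in product((0, 1), repeat=m) if f(x) == 0)


def exact_pi0(f: Callable[[Tuple[int, ...]], int], m: int) -> Fraction:
    """Exact probability pi(0) = #f / 2^m of measuring 0 on the last qubit."""
    return Fraction(count_zeros(f, m), 2 ** m)


# Example: f is the AND of two bits, which vanishes on three of the four inputs.
pi0_and = exact_pi0(lambda x: x[0] & x[1], 2)
```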



5 Clifford circuits as HT circuits 

In this section we show that, modulo some "redundancies" (to be explained below), each Clifford 
circuit can be reduced to an HT circuit, which is manifestly classically simulatable. To this aim, 
we provide an alternative proof of the Gottesman-Knill theorem which will allow us to make the
connection to HT circuits. The proof will not be centered around stabilizer groups (and hence
does not require any knowledge of the stabilizer formalism) but will simply consist of tracking the
coefficients $\psi_x$ of the state $|\psi\rangle = \sum_x \psi_x |x\rangle$ of the quantum register throughout the application
of an arbitrary Clifford circuit. To achieve this, it will be useful to know the most general
form which the coefficients $\psi_x$ may assume. The latter issue has in fact already been fully
investigated and understood in [16], where the following result was proved: every N-qubit state
$|\psi\rangle = C|0\rangle^{\otimes N}$, where C represents a Clifford circuit, is given by an expression of the form

$$|\psi\rangle = \frac{1}{\sqrt{|A|}} \sum_{x \in A} i^{l(x)} (-1)^{q(x)} |x\rangle, \qquad (2)$$

where we have the following notations. First, $A \subseteq \mathbb{Z}_2^N$ is an affine subspace, i.e. a subset of
the form $A = \{Ru + t \mid u \in \mathbb{Z}_2^m\}$, for some fixed (nonsingular) N x m matrix R and vector
$t \in \mathbb{Z}_2^N$. Second, l(x) is a linear function on $\mathbb{Z}_2^N$, i.e., l maps x to $l(x) = d^T x$ for some $d \in \mathbb{Z}_2^N$.
The exponent l(x) of i is computed modulo two. Finally, q(x) is a quadratic function on $\mathbb{Z}_2^N$.
That is, q maps N-bit strings $x = (x_1, \ldots, x_N)$ to $q(x) = \sum_{i<j} c_{ij} x_i x_j + \sum_i c_i x_i$, for some (fixed)
$c_{ij}, c_i \in \{0, 1\}$. Moreover, q, l and A can be efficiently computed. Conversely, every state of the
form (2) is a stabilizer state.

The proof that an arbitrary Clifford circuit always maps $|0\rangle$ to a state of the form
(2) was first derived in [16] and is based on mappings between Clifford operations, the stabilizer
formalism and arithmetic over $\mathbb{Z}_2$. Here we will give a simple alternative proof of this theorem.

Proof of Eq. (2): The proof can be obtained by induction on the number of gates in the circuit
C. Evidently, (2) is true when C is the identity. Now suppose that the result is true for every Clifford
circuit consisting of K gates. We then have to prove that the result still holds for circuits with one
additional gate. Consider an arbitrary circuit C of size K + 1, and write C as a product of a single
Clifford gate $U \in \{H, P, \mathrm{CNOT}\}$ and a circuit C' of size K: C = UC'. By the induction hypothesis, we
may assume that there exist A, l and q such that $C'|0\rangle$ is given by (2). It is then simply a matter
of verifying that the general form (2) is preserved when a gate from the set {H, P, CNOT} is applied.
This can be shown using straightforward arithmetic, and for completeness these calculations are
given in the appendix. □

The above proof of expression (2) leads to a simple algorithm to compute A, l and q if a
circuit C is given: first, one initializes A, l and q to their trivial values, corresponding to the state
$|0\rangle$. Then one sequentially updates A, l and q corresponding to the first, second, etc. gate in C.
The final values for A, l and q then correspond to the state $C|0\rangle$.

It is now easy to see that each Clifford computation can be simulated efficiently. If a circuit
C is given, the first step is to compute the triple (A, l, q) parameterizing $C|0\rangle$. The goal is then
to efficiently simulate $\{|0\rangle, |1\rangle\}$ measurements on this state. To do so, one argues as follows.
First, the outcome probabilities of a measurement in the computational basis are independent of
the phases $i^{l(x)} (-1)^{q(x)}$. That is, the specific values of l and q are in this context completely
redundant, such that we may set these quantities to their trivial values l = 0, q = 0. This leaves us
with the state $\sum_{x \in A} |x\rangle$ (up to normalization). Let the affine space A be given by
$A = \{Ru + t : u \in \mathbb{Z}_2^m\}$, for some nonsingular N x m matrix R and $t \in \mathbb{Z}_2^N$. Further, suppose that
the first qubit is measured, yielding as an outcome a bit $\alpha$ which occurs with probability

$$\pi(\alpha) = \frac{|\{x \in A : x_1 = \alpha\}|}{|A|}. \qquad (3)$$

It is trivial to sample from this probability distribution: simply generate a uniformly random m-bit
string u, compute Ru + t and set $\alpha$ to be the first component of Ru + t. This shows that each
Clifford circuit can efficiently be simulated in the weak sense.
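
A minimal Python sketch of this sampling step is given below. It assumes that the affine space has already been computed and is stored as a matrix R (a list of N rows, each a list of m bits) and a vector t (a list of N bits); this data layout and the function names are hypothetical choices made for illustration.

```python
import random
from typing import List, Optional


def sample_affine_point(R: List[List[int]], t: List[int],
                        rng: Optional[random.Random] = None) -> List[int]:
    """Return x = R u + t (mod 2) for a uniformly random m-bit string u."""
    if rng is None:
        rng = random.Random()
    m = len(R[0]) if R else 0
    u = [rng.randrange(2) for _ in range(m)]
    return [
        (sum(r_ij & u_j for r_ij, u_j in zip(row, u)) + t_i) % 2
        for row, t_i in zip(R, t)
    ]


def sample_first_qubit(R: List[List[int]], t: List[int],
                       rng: Optional[random.Random] = None) -> int:
    """Sample the outcome of a computational basis measurement of the first qubit."""
    return sample_affine_point(R, t, rng)[0]
```

For instance, with R = [[1, 0], [1, 0], [0, 1]] and t = [0, 1, 0], every sampled string has its first two bits different from each other, since $x_1 = u_1$ and $x_2 = u_1 + 1$.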

However, one can do more than this, as the probabilities $\pi(\alpha)$ can be computed efficiently
as well. Indeed, each nonzero probability has the form $2^{-\gamma}$ for some integer $\gamma$ which can be
obtained by solving a system of linear equations over $\mathbb{Z}_2$. Hence, strong simulation is possible as
well, thus recovering the Gottesman-Knill theorem.
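
The linear-algebra step can be sketched as follows, under the assumption that A is specified by a matrix R (a list of N rows of m bits) and a vector t. Requiring $x_j = \alpha_j$ for the measured qubits j gives a linear system in the unknown u; if it is inconsistent the probability is 0, and otherwise a fraction $2^{-r}$ of all strings u solves it, where r is the rank of the selected rows of R. The function name `outcome_probability` and the dictionary-based interface are hypothetical.

```python
from fractions import Fraction
from typing import Dict, List


def outcome_probability(R: List[List[int]], t: List[int],
                        outcome: Dict[int, int]) -> Fraction:
    """Probability that x_j = outcome[j] for all listed j, with x uniform over A = {Ru + t}."""
    m = len(R[0]) if R else 0
    # Augmented system over Z_2: (row j of R) . u = outcome[j] + t[j].
    rows = [list(R[j]) + [outcome[j] ^ t[j]] for j in sorted(outcome)]
    rank = 0
    for col in range(m):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(len(rows)):
            if r != rank and rows[r][col]:
                rows[r] = [a ^ b for a, b in zip(rows[r], rows[rank])]
        rank += 1
    # A row reading 0 = 1 means no u satisfies the constraints.
    if any(row[m] and not any(row[:m]) for row in rows):
        return Fraction(0)
    return Fraction(1, 2 ** rank)
```

For the single-qubit measurement of Eq. (3) this reduces to a simple case distinction: if the first row of R vanishes the outcome is deterministic, and otherwise both outcomes occur with probability 1/2.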



6 Normal form 

Next we note that the expression (2) provides an alternative way, different from the circuit C, to
prepare the output state $C|0\rangle$. If the affine space A is given by $A = \{Ru + t : u \in \mathbb{Z}_2^m\}$, then
this state can be prepared from $|0\rangle$ as follows. First, apply m Hadamard gates such as to yield
$\sum_u |u\rangle|0\rangle^{\otimes (N-m)}$. Second, apply appropriate CNOT gates such as to yield $\sum_u |Ru\rangle$. Third, apply
the NOT operation $X_1^{t_1} \cdots X_N^{t_N}$, yielding $\sum_u |Ru + t\rangle$. Fourth, apply appropriate PHASE and
CPHASE gates such as to yield $C|0\rangle$. The overall structure of this preparation method is illustrated
in Fig. 2. We have now arrived at the following result.

Theorem 2 Let C be an arbitrary poly-sized Clifford circuit. Then there exists a poly-sized
Clifford circuit C' satisfying $C|0\rangle = C'|0\rangle$ such that C' can be decomposed into three "rounds":
(ROUND 1) apply Hadamard gates to an arbitrary subset of qubits; (ROUND 2) apply a
poly-sized circuit of NOTs and CNOTs; (ROUND 3) apply a poly-sized circuit of PHASEs and CPHASEs.
The circuit C' can be efficiently determined.

It is important to remark that it is generally not true that the circuits C and C' are equal as $2^N \times 2^N$
matrices^5. These circuits merely have the same effect on the input state $|0\rangle$, i.e., they generate
the same output state. Note that it is the output state $|\psi_{out}\rangle = C|0\rangle$ of the computation which
is relevant for our purposes (i.e. a single column of the $2^N \times 2^N$ matrix C) and not the entire
circuit. We further emphasize that the choice of a standard input state $|0\rangle$, as opposed to arbitrary
products of X, Y, Z eigenstates as inputs, does not entail any loss of generality, as such alternative
input states can always be rotated into the state $|0\rangle$ using a (local) Clifford operation.

The Clifford normal form highlights the computational "weakness" of Clifford circuits. First,
the original Clifford circuit, which may contain many Hadamard gates located at different places in
the circuit, causing subsequent rounds of constructive and destructive interference, is (efficiently)
mapped to the normal form which does not display any interference at all. Indeed, up to the
irrelevant last round of diagonal gates, each Clifford circuit is reduced to a circuit of NOT and CNOT
gates applied to a superposition of computational basis states $|+\rangle^{\otimes m}|0\rangle^{\otimes (N-m)}$; in other words, nothing
but a (very simple instance of an) HT circuit, which is trivially classically simulatable. Also the
connection between Clifford circuits and the complexity class ⊕L is in this way highlighted, given
the relation between the class ⊕L and circuits consisting of NOT and CNOT operations.

^5 A counterexample is e.g. given by the single-qubit circuit C = HPH. By exhaustive enumeration of all possibilities,
it can easily be shown that no single-qubit circuit C' of the above structure satisfies C = C'.

With a little extra work, theorem 2 may be used to arrive at a normal form for Clifford circuits
C which does regard the entire $2^N \times 2^N$ matrix. The normal form is in fact highly similar to
theorem 2. The significance of this normal form is again that it contains only a single round of
basis-changing operations (i.e. Hadamards). In fact, we will see that every Clifford operation
can be written as a tensor product of Hadamard operations $\mathcal{H} = H_S \otimes I$ acting nontrivially
on a subset S of the qubits, multiplied on the left and the right with basis-preserving Clifford
circuits $M_1$ and $M_2$, i.e. circuits composed of CNOT, PHASE and CPHASE gates. Similar to (2),
(a variant of) theorem 3 was proved in [16] using mappings between the stabilizer formalism,
Clifford operations and $\mathbb{Z}_2$-arithmetic; here we provide a direct proof.

Theorem 3 Let C be an arbitrary n-qubit Clifford operation. Then there exist: (a) poly-size
circuits $M_1$ and $M_2$ composed of CNOT, PHASE and CPHASE gates and (b) a tensor product of
Hadamard gates and identities $\mathcal{H} = H_S \otimes I$ acting nontrivially on a subset S of the qubits,
such that $C \propto M_2 \mathcal{H} M_1$. Moreover, $M_1$, $M_2$ and $\mathcal{H}$ can be determined efficiently.

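Proof: Let $a = (a_1, \ldots, a_n)$ denote an arbitrary n-bit string. For every i = 1, ..., n, define
$\sigma_i := C X_i C^\dagger$ where $X_i$ denotes the Pauli matrix X acting on qubit i. Since C is a Clifford
operation, each $\sigma_i$ is a Pauli operator^6, possibly with an overall minus sign. Denoting
$X(a) = X^{a_1} \otimes \cdots \otimes X^{a_n}$ and $\sigma(a) := \prod_i \sigma_i^{a_i}$, we thus have $\sigma(a) = C X(a) C^\dagger$. Since
$|a\rangle = X(a)|0\rangle^{\otimes n}$, we have $C|a\rangle = C X(a)|0\rangle^{\otimes n} = \sigma(a) C|0\rangle^{\otimes n}$. We can now apply theorem 2 to
the state $C|0\rangle^{\otimes n}$: there exists a poly-size Clifford circuit M consisting of PHASE, CPHASE and CNOT gates, and an
operation $\mathcal{H} = H_S \otimes I$ such that $C|0\rangle^{\otimes n} = \gamma M \mathcal{H}|0\rangle^{\otimes n}$, where $\gamma$ is some overall phase factor. Now
define $\tau_i := \mathcal{H} M^\dagger \sigma_i M \mathcal{H}$ for every i; also $\tau_i$ is a Pauli operator, possibly supplemented with an
overall minus sign. Denoting $\tau(a) := \prod_i \tau_i^{a_i}$, we thus have $\tau(a) = \mathcal{H} M^\dagger \sigma(a) M \mathcal{H}$. This leads
to

$$C|a\rangle = \gamma\, \sigma(a) M \mathcal{H}|0\rangle^{\otimes n} = \gamma\, M \mathcal{H}\, \tau(a)|0\rangle^{\otimes n}. \qquad (4)$$

Note that $\gamma$ does not depend on a.

We now study the general form of the unitary mapping $|a\rangle \to \tau(a)|0\rangle^{\otimes n}$. Since each $\tau_i$ is an
element of the Pauli group, there exist n-bit strings $u = (u_1, \ldots, u_n)$, $v = (v_1, \ldots, v_n)$,
$R^i = (R^i_1, \ldots, R^i_n)$ and $T^i = (T^i_1, \ldots, T^i_n)$ such that $\tau_i = (-1)^{u_i} i^{v_i} X(R^i) Z(T^i)$ and hence

$$\tau(a) = \prod_i (-1)^{a_i u_i}\, i^{a_i v_i}\, X(a_i R^i) Z(a_i T^i). \qquad (5)$$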



^6 A Pauli operator has the form $P = P_1 \otimes \cdots \otimes P_n$ where each $P_i$ is either the identity or one of the Pauli matrices
X, Y or Z.
Figure 2: For each Clifford circuit C acting on the computational basis state $|0\rangle$ (left), there
exists a normal form C' (right) of the above structure such that $C|0\rangle = C'|0\rangle$, i.e., these two
circuits yield the same output state. The last round, consisting of diagonal gates, is undetected
by a computational basis measurement and can therefore be completely disregarded. Moreover,
the first two rounds in C' constitute a (restricted) HT circuit and hence are trivially classically
simulatable.

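For every two $x, z \in \mathbb{Z}_2^n$, we have $X(x)Z(z) = (-1)^{x^T z} Z(z)X(x)$. Hence, we may reshuffle
the factors in the product (5) at the cost of an overall phase factor of the form $(-1)^{q(a)}$ for some
quadratic function $q : \mathbb{Z}_2^n \to \mathbb{Z}_2$. In particular, we may write

$$\tau(a) = (-1)^{q(a)} \Big\{\prod_i i^{a_i v_i}\Big\} \Big\{\prod_i X(a_i R^i)\Big\} \Big\{\prod_i Z(a_i T^i)\Big\}. \qquad (6)$$

Letting R (T) denote the n x n matrix with columns $R^i$ ($T^i$), we then have

$$\tau(a) = (-1)^{q(a)} \prod_i i^{a_i v_i}\, X(Ra) Z(Ta). \qquad (7)$$

The action of $\tau(a)$ on $|0\rangle^{\otimes n}$ then has the following form:

$$\tau(a)|0\rangle^{\otimes n} = (-1)^{q(a)} \prod_i i^{a_i v_i}\, X(Ra) Z(Ta)|0\rangle^{\otimes n} = (-1)^{q(a)} \prod_i i^{a_i v_i}\, |Ra\rangle, \qquad (8)$$

where we have used that $Z(Ta)|0\rangle^{\otimes n} = |0\rangle^{\otimes n}$ and that $X(Ra)|0\rangle^{\otimes n} = |Ra\rangle$. It then easily follows
that there exists a Clifford circuit M' (independent of a) composed of CNOT, PHASE and CPHASE
gates such that $\tau(a)|0\rangle^{\otimes n} = M'|a\rangle$. Together with (4), this completes the proof. □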

7 Slightly beyond Gottesman-Knill 

The Clifford normal form also makes it easy to extend the Gottesman-Knill theorem. Next we pro- 
vide a class of quantum circuits which are efficiently simulatable classically (in the weak sense), 
and which are extensions of Gottesman-Knill in the sense that these circuits can efficiently gen- 
erate any output state which can be efficiently generated by arbitrary Clifford circuits. Moreover, 
contrary to Clifford circuits, the extensions encompass full probabilistic classical computation. 

We consider circuits of the following structure: first, apply an arbitrary local unitary operation;
afterwards, apply an arbitrary (uniform family of) quantum circuit(s) consisting of Toffoli and
diagonal gates only. We show that such circuits, when applied to the input $|0\rangle$ and when followed
by a computational basis measurement, can be efficiently simulated classically in the weak sense.
After the first round, the state of the quantum register is a complete product state
$|\chi\rangle = |\chi_1\rangle \otimes \cdots \otimes |\chi_N\rangle$. Denoting $|\chi_i\rangle = a_i|0\rangle + b_i|1\rangle$, the coefficients $\chi_x$ of $|\chi\rangle$ in the
computational basis (where $x = (x_1, \ldots, x_N)$ is an N-bit string) are given by $\chi_x = \prod_i a_i^{1-x_i} b_i^{x_i}$. The final state
of the quantum register has the form $\sum_x \chi_x \theta_x |f(x)\rangle$, where the $\theta_x$ are complex phases (which
are efficiently computable as a function of x, but this will not be relevant here) and where
$f : \{0,1\}^N \to \{0,1\}^N$ is an invertible, efficiently computable Boolean function. Further, suppose
that a subset S of the N qubits is measured, yielding as an outcome a bit string
$\alpha = (\alpha_j : j \in S)$, where the bit $\alpha_j$ is the measurement outcome for qubit j. Then $\alpha$ occurs with probability
$\pi(\alpha) = \sum |\chi_x|^2$, where the sum is over all bit strings x such that $f(x)_j = \alpha_j$ for all $j \in S$.
To state it differently, $\pi(\alpha)$ is the total probability that an N-bit string x which is generated with
probability $p(x) = |\chi_x|^2$ satisfies $f(x)_j = \alpha_j$ for all $j \in S$. Now, it is possible to efficiently
sample from the probability distribution {p(x)}. As a consequence, the following procedure
(i)-(ii)-(iii) allows one to efficiently generate a bit string $\alpha$ with probability $\pi(\alpha)$ on a classical computer:
(i) generate an N-bit string x with probability $p(x) = |\chi_x|^2$ (note that {p(x)} is a simple product
distribution since $|\chi\rangle$ is a product state); (ii) compute f(x); (iii) set $\alpha_j$ to be the j-th component
of f(x), for every $j \in S$. This shows it is possible to efficiently sample from the probability
distribution $\{\pi(\alpha)\}$.
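
Steps (i)-(iii) can be sketched in Python as follows. The sketch assumes that the single-qubit states after the first round are given as pairs of amplitudes $(a_i, b_i)$ and that the permutation of basis states implemented by the Toffoli-type gates is supplied as a Python function; diagonal gates only affect the phases $\theta_x$ and are therefore ignored. All names are hypothetical.

```python
import random
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Bits = List[int]


def sample_measurement(amplitudes: Sequence[Tuple[complex, complex]],
                       f: Callable[[Bits], Bits],
                       measured: Sequence[int],
                       rng: Optional[random.Random] = None) -> Dict[int, int]:
    """Sample the measurement outcomes on the qubits listed in `measured`."""
    if rng is None:
        rng = random.Random()
    x: Bits = []
    for a_i, b_i in amplitudes:
        # (i) each bit is drawn independently, since |chi> is a product state;
        # the weights are normalised defensively (assumes a nonzero state).
        p_one = abs(b_i) ** 2 / (abs(a_i) ** 2 + abs(b_i) ** 2)
        x.append(1 if rng.random() < p_one else 0)
    y = f(x)                              # (ii) compute f(x)
    return {j: y[j] for j in measured}    # (iii) read off the measured bits


def not_then_toffoli(bits: Bits) -> Bits:
    """Hypothetical classical round: NOT on bit 0, then Toffoli (controls 0, 1; target 2)."""
    out = list(bits)
    out[0] ^= 1
    out[2] ^= out[0] & out[1]
    return out
```

For the deterministic product state $|0\rangle|1\rangle|0\rangle$, encoded as [(1, 0), (0, 1), (1, 0)], the example map sends x = (0, 1, 0) to (1, 1, 1), so measuring qubits 0 and 2 returns 1 for both.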

Note that the first round, consisting of a local unitary operation, can be generalized while
keeping the entire circuit classically simulatable. To do so, consider a quantum circuit $\mathcal{U}$ such that
the computation $|0\rangle \to \mathcal{U}|0\rangle$, followed by a computational basis measurement of all N qubits,
can be efficiently simulated classically in the weak sense. That is, it is classically possible to
efficiently sample once from the probability distribution $p(x) = |\langle x|\mathcal{U}|0\rangle|^2$. Then such a
circuit, followed by an arbitrary (uniform) quantum circuit consisting of classical and diagonal gates
only, and followed by a computational basis measurement of an arbitrary subset of the qubits, can
be simulated efficiently classically in the weak sense. This last observation e.g. implies that a
"matchgate circuit" [18, 19, 20, 21], a circuit of "bounded tree-width" [6, 7], a circuit which
generates "bounded Schmidt-rank" [5, 10], or a quantum Fourier transform [22, 23, 24] (which
are all known to be strongly simulatable), followed by an arbitrary poly-sized Toffoli-Diagonal
circuit, is still weakly simulatable.

8 Conclusion 

We have studied classical simulation of quantum computation, taking the Gottesman-Knill
theorem as a starting point. We have shown that each Clifford circuit C can be reduced to a simple
equivalent circuit C' which, when applied to the input $|0\rangle$, provides the same output state.
Using this reduction to the normal form, we have attempted to provide a better understanding of the
somewhat peculiar features of the Gottesman-Knill theorem. We have argued that the normal form
provides insight into why the large amount of entanglement which can be generated by Clifford
circuits fails to provide any quantum computational speed-up. Furthermore, the normal form shows
how the simulation of Clifford operations fits within the standard embedding of probabilistic
classical computation into the quantum circuit model, related to the notion of HT circuits.

The class of HT circuits considered here, while indeed simple, exhibits features which are
quite different from many results regarding classical simulation of quantum computation which
have been found so far (such as the Gottesman-Knill theorem itself, matchgates, circuits of small
tree-width, etc.). For example, while such circuits are classically simulatable in the weak sense
(i.e. they can be simulated using classical sampling techniques), strong classical simulation (i.e.
the problem of computing the output probabilities of measurement outcomes with high accuracy)
is #P-complete and hence intractable. This is in contrast with previous results, the vast majority
of which consider circuits where strong simulation is efficiently possible. Also, HT circuits can
generate unbounded amounts of entanglement (as can e.g. Clifford circuits), while several previous
results use techniques which allow classical simulation of certain quantum circuits only in cases
where the entanglement generated by these circuits is "bounded" (see e.g. [5, 6, 10]). Thus, the
present results provide simple examples of quantum circuits which can generate vast amounts of
entanglement, which nevertheless does not result in any quantum computational speed-up.

We further note that a very moderate extension of HT circuits immediately leads to highly
non-trivial quantum algorithms. For example, it is known that Shor's factoring algorithm, using
the phase estimation approach [30] (see also [31]), can be implemented by an HT circuit
supplemented with a final round of Hadamard gates, and then followed by a measurement in the
computational basis. This shows that a mere final round of local basis changes suffices to go from
an HT circuit, which is trivially classically simulatable, to Shor's factoring algorithm, which is
believed to provide an exponential speed-up.

Acknowledgements. I am very grateful to R. Jozsa, W. Dür, H. Briegel, A. Kay and I. Cirac
for interesting discussions and suggestions on the manuscript. Work supported by the Excellence 
Cluster MAP. 

Appendix A: Updating the triple (A, l, q)

Here we show the following claim.

Claim: suppose that an $N$-qubit state $|\psi\rangle$ is given in terms of an expression of the form (2)
for some $(A, l, q)$, and suppose that a gate $U \in \{H, P, \mathrm{CNOT}\}$ is applied to $|\psi\rangle$,
resulting in a state $|\psi'\rangle$. Then $|\psi'\rangle$ still has the form (2) for some updated $(A, l, q)$,
and these updates can be performed efficiently.

Before proceeding with the proof, we note the following: in (2), the functions $q$ and $l$ are
defined on the space of $N$-bit strings $\mathbb{Z}_2^N$. If the affine space $A$ is given by
$A = \{Ru + t : u \in \mathbb{Z}_2^m\}$, we might as well write

$$|\psi\rangle \propto \sum_{u \in \mathbb{Z}_2^m} (-1)^{\tilde q(u)}\, i^{\tilde l(u)}\, |Ru + t\rangle, \tag{9}$$

where now $\tilde q$ and $\tilde l$ are quadratic, resp. linear functions defined on the space of $m$-bit strings
$u$, such that $\tilde q(u) = q(Ru + t)$ and $\tilde l(u) = l(Ru + t)$. Note that it is computationally easy to
determine $\tilde q$ ($\tilde l$) from $q$ ($l$) and vice versa. In the following we will be a bit sloppy and simply
identify $q(u) = q(Ru + t)$, and similarly for $l$; it will be clear from the context what the notation
means.

We will also need the following lemma:

Lemma 1 Let $q(x)$ and $l(x)$ be quadratic and linear functions, resp., on $\mathbb{Z}_2^k$, where
$x = (x_1, \dots, x_k)$. Denote $\bar x = (x_2, \dots, x_k)$. Then there exist a quadratic function $q'$ and two
linear functions $l'$ and $l''$ on $\mathbb{Z}_2^{k-1}$ such that

$$\sum_{x_1 = 0}^{1} (-1)^{q(x)}\, i^{l(x)} \propto (-1)^{q'(\bar x)}\, i^{l'(\bar x)}\, \delta_{l''(\bar x),\,0}, \tag{10}$$

where $\propto$ denotes equality up to a multiplicative constant independent of $x$, and $\delta$ is the
Kronecker delta. Moreover, determining $q'$, $l'$ and $l''$ from $q$ and $l$ can be performed efficiently.
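
As a sanity check, the sum over $x_1$ can be evaluated by brute force for a small example. The sketch below is illustrative only: the particular $q$ and $l$ and the helper name `sum_out_first` are assumptions chosen for the example, and powers of $i$ are taken with the exponent reduced modulo 2.

```python
from itertools import product

def sum_out_first(q, l, k):
    """Return {x_bar: sum over x_1 of (-1)^q(x) * i^l(x)} for x in Z_2^k."""
    result = {}
    for x_bar in product((0, 1), repeat=k - 1):
        total = 0
        for x1 in (0, 1):
            x = (x1,) + x_bar
            sign = -1 if q(x) % 2 else 1
            phase = 1j if l(x) % 2 else 1   # i^l with l reduced modulo 2
            total += sign * phase
        result[x_bar] = total
    return result

# Hypothetical example on Z_2^3: q(x) = x1*x2 + x2*x3, l(x) = x3.
q = lambda x: x[0] * x[1] + x[1] * x[2]
l = lambda x: x[2]

sums = sum_out_first(q, l, 3)
for x_bar, value in sums.items():
    print(x_bar, value)
```

In this example the sum vanishes unless $x_2 = 0$, and on the surviving strings it equals $2\, i^{x_3}$: a phase of the allowed type together with a linear constraint on the remaining bits.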

The proof of the lemma is straightforward and is omitted. We now proceed with the proof of the 
claim. 

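Proof of claim: First, suppose that $|\psi\rangle$ is given by (2) and that a PHASE gate is applied to,
say, the first qubit. As $P$ maps $|a\rangle$ to $i^a|a\rangle$ ($a = 0, 1$), the resulting state $|\psi'\rangle$ is

$$|\psi'\rangle \propto \sum_{x \in A} (-1)^{q(x)}\, i^{l(x)}\, i^{x_1}\, |x\rangle, \tag{11}$$

where $x_1$ is the first component of $x$. Using the identity $i^a i^b = (-1)^{ab}\, i^{a+b}$ for every
$a, b \in \mathbb{Z}_2$ (see footnote 4) then shows that $|\psi'\rangle$ is again of the form (2).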
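
The identity used in the last step can be checked exhaustively. In the sketch below (an illustration, not part of the original argument) a power of $i$ is represented by its exponent modulo 4, so that $-1$ corresponds to the exponent 2; the function names are hypothetical.

```python
def lhs_exponent(a, b):
    """Exponent (mod 4) of i in the product i^a * i^b."""
    return (a + b) % 4

def rhs_exponent(a, b):
    """Exponent (mod 4) of i in (-1)^(a*b) * i^(a+b), with a+b taken mod 2."""
    return (2 * a * b + ((a + b) % 2)) % 4

# Check all four pairs of bits.
for a in (0, 1):
    for b in (0, 1):
        assert lhs_exponent(a, b) == rhs_exponent(a, b)
```

Applied with $a = l(x)$ and $b = x_1$, the identity absorbs the extra factor $i^{x_1}$ into an updated linear function $l(x) + x_1$ and an updated quadratic function $q(x) + l(x)\,x_1$.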

Second, suppose that a CNOT gate is applied to, say, the first and second qubit of $|\psi\rangle$. Note
that CNOT maps $|a, b\rangle$ to $|a, a + b\rangle$, for every $a, b = 0, 1$. That is, CNOT performs a linear
transformation (over $\mathbb{Z}_2$) "within the ket". This immediately implies that the state (9) is mapped
to

$$|\psi'\rangle \propto \sum_{u \in \mathbb{Z}_2^m} (-1)^{q(u)}\, i^{l(u)}\, |R'u + t'\rangle, \tag{12}$$

for some appropriate $R'$ and $t'$ which are easily determined by performing the CNOT. Thus, also
in this case the form (2) is kept.
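
A minimal sketch of this update, assuming $R$ is stored as a list of rows over $\mathbb{Z}_2$ and $t$ as a list of bits (the function names are hypothetical): a CNOT with control $c$ and target $d$ adds row $c$ of $R$ to row $d$ and adds $t_c$ to $t_d$, while $q(u)$ and $l(u)$ are left untouched.

```python
from itertools import product

def apply_cnot(R, t, control, target):
    """Return (R', t') such that CNOT|Ru + t> = |R'u + t'> for all u."""
    R_new = [row[:] for row in R]
    t_new = t[:]
    R_new[target] = [x ^ y for x, y in zip(R[target], R[control])]
    t_new[target] = t[target] ^ t[control]
    return R_new, t_new

def affine_point(R, t, u):
    """Compute the bit string Ru + t over Z_2."""
    return [(sum(r * x for r, x in zip(row, u)) + tk) % 2 for row, tk in zip(R, t)]

# Hypothetical example with N = 3 qubits and m = 2 free variables.
R = [[1, 0], [0, 1], [1, 1]]
t = [1, 0, 0]
R_new, t_new = apply_cnot(R, t, control=0, target=1)

for u in product((0, 1), repeat=2):
    x = affine_point(R, t, u)
    x[1] ^= x[0]  # act with the CNOT directly on the bit string
    assert x == affine_point(R_new, t_new, u)
```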

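Finally, suppose that a Hadamard gate is applied to, say, the first qubit of $|\psi\rangle$. This is the most
nontrivial case of the three. Recall that $H$ maps $|a\rangle$ to $\sum_{b=0}^{1} (-1)^{ab}|b\rangle$, up to normalization
(where $a = 0, 1$). Denoting the first row of $R$ by $r^T$, letting $\bar R$ be the $(N-1) \times m$ matrix obtained
by removing the first row of $R$ and denoting $\bar t = (t_2, \dots, t_N)$, one has

$$|\psi'\rangle \propto \sum_{u \in \mathbb{Z}_2^m} \sum_{b=0}^{1} (-1)^{q(u) + b\,(r^T u + t_1)}\, i^{l(u)}\, |b, \bar R u + \bar t\rangle. \tag{13}$$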

Note that, as $R$ has full rank $m$, $\bar R$ may have either full rank $m$, or rank $m - 1$. If $\bar R$ has full
rank, then (13) is of the form (9) and we are done. If $\bar R$ has rank $m - 1$, some additional work
is required. In this case, there is exactly one nontrivial linear combination of the columns of $\bar R$
which yields the zero vector. Without loss of generality, we may assume that the first column of $\bar R$
can be written as a linear combination of the other columns, and that the last $m - 1$ columns of $\bar R$
are linearly independent. Denoting the columns of $\bar R$ by $c^i$ ($i = 1, \dots, m$), we therefore conclude
that there exists a bit string $y = (y_2, \dots, y_m)$ such that $c^1 = \sum_{i=2}^{m} y_i c^i$. Note that determining
$y$ is a $\oplus$L problem as $y$ is the solution to a system of linear equations over $\mathbb{Z}_2$. It is then easy to
verify that $\bar R Q = [0 | c^2 | \dots | c^m]$, where the $m \times m$ invertible matrix $Q$ is defined by

$$Q = \begin{pmatrix} 1 & & & \\ y_2 & 1 & & \\ \vdots & & \ddots & \\ y_m & & & 1 \end{pmatrix}. \tag{14}$$

Making the substitution $u = Qu'$ in (13) yields an expression of the following form:

$$|\psi'\rangle \propto \sum_{u' \in \mathbb{Z}_2^m} \sum_{b=0}^{1} (-1)^{q'(u', b)}\, i^{l'(u')}\, |b, \bar R Q u' + \bar t\rangle, \tag{15}$$

for some quadratic and linear functions $q'$ and $l'$ which can be easily determined (by a $\oplus$L
computer). To lighten notation we will drop all primes in the above expression, i.e., $u' = u$, $q' = q$,
$l' = l$. Now note that the vector $\bar R Q u$ does not depend on the first coefficient of $u$, since the first
column of $\bar R Q$ is zero. Indeed, denoting $\bar u = (u_2, \dots, u_m)$, we have $\bar R Q u = [c^2 | \dots | c^m]\bar u$ (note
also that $[c^2 | \dots | c^m]$ has full rank). Therefore, in (15) the variable $u_1$ can be fully summed out.
Using Lemma 1 then shows that $|\psi'\rangle$ is again of the desired form (2). This completes the proof.

Footnote 4: Here, the exponent $a + b$ in $i^{a+b}$ is computed modulo 2.
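
The change of variables can be illustrated numerically. The following sketch assumes the matrix (called `R_bar` here) is given as a list of rows over $\mathbb{Z}_2$ whose first column is a combination of the remaining, linearly independent columns; `solve_gf2`, `build_Q` and `matmul_gf2` are hypothetical helper names, and plain Gauss-Jordan elimination stands in for whatever linear solver one prefers.

```python
def solve_gf2(C, rhs):
    """Solve C y = rhs over Z_2 by Gauss-Jordan elimination (None if unsolvable)."""
    rows = [row[:] + [bit] for row, bit in zip(C, rhs)]  # augmented matrix
    n_cols = len(C[0])
    pivots, r = [], 0
    for c in range(n_cols):
        p = next((i for i in range(r, len(rows)) if rows[i][c]), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c]:
                rows[i] = [x ^ y for x, y in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
    if any(row[-1] for row in rows[r:]):
        return None  # inconsistent system
    y = [0] * n_cols
    for i, c in enumerate(pivots):
        y[c] = rows[i][-1]
    return y

def build_Q(y):
    """Identity matrix whose first column is replaced by (1, y_2, ..., y_m)."""
    m = len(y) + 1
    Q = [[1 if i == j else 0 for j in range(m)] for i in range(m)]
    for i, yi in enumerate(y, start=1):
        Q[i][0] = yi
    return Q

def matmul_gf2(A, B):
    """Matrix product over Z_2."""
    return [[sum(a * b for a, b in zip(row, col)) % 2 for col in zip(*B)] for row in A]

# Hypothetical example: columns c1 = (1,1), c2 = (1,0), c3 = (0,1), so c1 = c2 + c3.
R_bar = [[1, 1, 0],
         [1, 0, 1]]
c1 = [row[0] for row in R_bar]
rest = [row[1:] for row in R_bar]
y = solve_gf2(rest, c1)
Q = build_Q(y)
print(y, matmul_gf2(R_bar, Q))
```

In this example $y = (1, 1)$ and the product has a zero first column, so the first variable no longer appears inside the ket and can be summed out.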



References 

[1] D. Gottesman, talk at International Conference on Group Theoretic Methods in Physics
(1998), arXiv:quant-ph/9807006.

[2] M. Hein, W. Dür, J. Eisert, R. Raussendorf, M. Van den Nest and H. J. Briegel, Proceedings
of the International School of Physics "Enrico Fermi" on "Quantum Computers, Algorithms
and Chaos" (2005); see also arXiv:quant-ph/0602096.

[3] H. J. Briegel and R. Raussendorf, Phys. Rev. Lett. 86, 910-913 (2001).

[4] R. Jozsa and N. Linden, arXiv:quant-ph/0201143.

[5] G. Vidal, Phys. Rev. Lett. 91, 147902 (2003).

[6] I. Markov and Y. Shi, arXiv:quant-ph/0511069.

[7] Y. Shi, L.-M. Duan and G. Vidal, Phys. Rev. A 74, 022320 (2006).

[8] N. Yoran and A. Short, arXiv:quant-ph/0601178.

[9] R. Jozsa, arXiv:quant-ph/0603163.

[10] M. Van den Nest, W. Dür, G. Vidal and H. J. Briegel, Phys. Rev. A 75, 012337 (2007).

[11] S. Aaronson and D. Gottesman, Phys. Rev. A 70, 052328 (2004).




[12] C. H. Papadimitriou and S. K. Zachos, preprint MIT/LCS/TM-228, MIT (1982).

[13] C. Damm, Inform. Process. Lett. 36, 247 (1990).

[14] G. Nebe, E. M. Rains and N. J. A. Sloane, Designs, Codes and Cryptography 24, 99 (2001).

[15] R. Solovay, talk at Mathematical Sciences Research Institute (2000).

[16] J. Dehaene and B. De Moor, Phys. Rev. A 68, 042318 (2003).

[17] C. H. Papadimitriou, Computational complexity (Addison-Wesley, Reading, MA, 1994).

[18] L. G. Valiant, Proc. ACM Symp. on Theory of Computing (2001), p. 114.

[19] D. DiVincenzo and B. Terhal, arXiv:quant-ph/0403031.

[20] S. Bravyi, arXiv:0801.2989.

[21] R. Jozsa and A. Miyake, arXiv:0804.4050.

[22] D. Aharonov, Z. Landau, J. Makowsky, arXiv:quant-ph/0611156.

[23] N. Yoran and A. Short, Phys. Rev. A 76, 042321 (2007).

[24] D. E. Browne, New J. Phys. 9, 146 (2007).

[25] S. Bravyi and R. Raussendorf, Phys. Rev. A 76, 022304 (2007).

[26] M. A. Nielsen and I. L. Chuang, Quantum Computation and Quantum Information
(Cambridge University Press, 2000).

[27] A. R. Calderbank, E. M. Rains, P. W. Shor and N. J. A. Sloane, Phys. Rev. Lett. 78,
405-408 (1997).

[28] D. Gottesman, Stabilizer codes and quantum error correction, Caltech Ph.D. Thesis (1997).

[29] Y. Shi, arXiv:quant-ph/0205115.

[30] A. Y. Kitaev, arXiv:quant-ph/9511026.

[31] D. Shepherd, arXiv:quant-ph/0508153.






